Large Scale Community-Aware Network Generation
Ramavarapu, Vikram, Lamy, João Alfredo Cardoso, Dindoost, Mohammad, Bader, David A.
Community detection, or network clustering, is used to identify latent community structure in networks. Due to the scarcity of labeled ground truth in real-world networks, evaluating these algorithms poses significant challenges. To address this, researchers use synthetic network generators that produce networks with ground-truth community labels. RECCS is one such algorithm: it takes a network and its clustering as input and generates a synthetic network through a modular pipeline. Each generated ground-truth cluster preserves key characteristics of the corresponding input cluster, including connectivity, minimum degree, and degree sequence distribution. The output consists of a synthetically generated network and disjoint ground-truth cluster labels for all nodes. In this paper, we present two enhanced versions: RECCS+ and RECCS++. RECCS+ maintains algorithmic fidelity to the original RECCS while introducing parallelization through an orchestrator that coordinates algorithmic components across multiple processes and employs multithreading. RECCS++ builds on this foundation with additional algorithmic optimizations to achieve further speedup. Our experimental results demonstrate that RECCS+ and RECCS++ achieve speedups of up to 49x and 139x, respectively, on our benchmark datasets, with RECCS++'s additional performance gains involving a modest accuracy tradeoff. With this newfound performance, RECCS++ can now scale to networks with over 100 million nodes and nearly 2 billion edges.
- Europe > Netherlands > South Holland > Leiden (0.08)
- North America > United States > Illinois (0.05)
- South America > Brazil (0.04)
- (3 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications > Social Media (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Using Large Language Models to Simulate Human Behavioural Experiments: Port of Mars
Slumbers, Oliver, Leibo, Joel Z., Janssen, Marco A.
Collective risk social dilemmas (CRSD) highlight a trade-off between individual preferences and the need for all to contribute toward achieving a group objective. Problems such as climate change are in this category, and so it is critical to understand their social underpinnings. However, rigorous CRSD methodology often demands large-scale human experiments, and it is difficult to guarantee sufficient power and heterogeneity over socio-demographic factors. Generative AI offers a potential complementary approach to address this problem. By replacing human participants with large language models (LLMs), it allows for a scalable empirical framework. This paper focuses on the validity of this approach and whether it is feasible to represent a large-scale human-like experiment with sufficient diversity using LLMs. In particular, where previous literature has focused on political surveys, virtual towns, and classical game-theoretic examples, we focus on a complex CRSD used in the institutional economics and sustainability literature known as Port of Mars.
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > Arizona (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Government (0.92)
- Leisure & Entertainment > Games (0.67)
- Health & Medicine > Therapeutic Area (0.67)
- Health & Medicine > Consumer Health (0.46)
LLM Generated Distribution-Based Prediction of US Electoral Results, Part I
Bradshaw, Caleb, Miller, Caelen, Warnick, Sean
This paper introduces distribution-based prediction, a novel approach to using Large Language Models (LLMs) as predictive tools by interpreting output token probabilities as distributions representing the models' learned representation of the world. This distribution-based nature offers an alternative perspective for analyzing algorithmic fidelity, complementing the approach used in silicon sampling. We demonstrate the use of distribution-based prediction in the context of the recent United States presidential election, showing that this method can be used to determine task-specific bias, prompt noise, and algorithmic fidelity. This approach has significant implications for assessing the reliability and increasing transparency of LLM-based predictions across various domains.
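The core idea described in this abstract, reading a prediction distribution directly off a model's output token probabilities rather than sampling completions, can be sketched as follows. The log-probability values, candidate labels, and the idea of renormalizing over a fixed answer set are illustrative assumptions for this sketch, not details taken from the paper.

```python
import math

# Hypothetical next-token log-probabilities an LLM might return for an
# election-prediction prompt (many APIs expose these as "logprobs").
# The numbers below are made up for illustration.
logprobs = {
    "CandidateA": -0.9,
    "CandidateB": -1.1,
    "CandidateC": -3.2,
    "The": -2.5,  # irrelevant token; excluded from the answer set
}

# Restrict attention to the valid answer tokens and renormalize their
# probabilities so they form a proper distribution over outcomes.
candidates = ["CandidateA", "CandidateB", "CandidateC"]
weights = {c: math.exp(logprobs[c]) for c in candidates}
total = sum(weights.values())
distribution = {c: w / total for c, w in weights.items()}

print(distribution)  # probabilities over candidates, summing to 1
```

Repeating this across prompt variations would then let one separate stable model beliefs from prompt noise, in the spirit of the analysis the abstract describes.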
- Europe (0.14)
- North America > United States > California (0.06)
- North America > United States > Alabama (0.06)
- (45 more...)
Can Large Language Models Capture Public Opinion about Global Warming? An Empirical Assessment of Algorithmic Fidelity and Bias
Lee, S., Peng, T. Q., Goldberg, M. H., Rosenthal, S. A., Kotcher, J. E., Maibach, E. W., Leiserowitz, A.
Large language models (LLMs) have demonstrated their potential in social science research by emulating human perceptions and behaviors, a concept referred to as algorithmic fidelity. The LLMs were conditioned on demographics and/or psychological covariates to simulate survey responses. The findings indicate that LLMs can effectively capture presidential voting behaviors but encounter challenges in accurately representing global warming perspectives when relevant covariates are not included. GPT-4 exhibits improved performance when conditioned on both demographics and covariates. However, disparities emerge in LLM estimations of the views of certain groups, with LLMs tending to underestimate worry about global warming among Black Americans. While highlighting the potential of LLMs to aid social science research, these results underscore the importance of meticulous conditioning, model selection, survey question format, and bias assessment when employing LLMs for survey simulation. Further investigation into prompt engineering and algorithm auditing is essential to harness the power of LLMs while addressing their inherent limitations.

Keywords: Global warming; large language models; algorithmic fidelity; public opinion

1. Introduction

It is very important to measure public opinion about global warming, as these opinions can have considerable influence over policy-making decisions (Bromley-Trujillo & Poe, 2020) and shape public behavior (Doherty & Webler, 2016). A primary method employed by scholars and policymakers for measuring and assessing these opinions is through representative surveys (Berinsky, 2017). However, the extensive time and financial resources required for these surveys can hinder the timely tracking of evolving public opinions about global warming.
Resource constraints can also lead to an unintended bias towards majority opinions, potentially neglecting the perspectives of minority groups due to their typically smaller sample sizes in national representative surveys. Nonetheless, understanding diverse public opinion regarding global warming is also vital for climate justice. This understanding can promote equitable decision-making, elevate the concerns of vulnerable communities, help align climate policies with democratic principles, build public support, and address disparities in climate change awareness and priorities. Furthermore, understanding the diversity of public opinion can help support a just transition and mobilize support for climate justice initiatives.
- North America > United States > Virginia > Fairfax County > Fairfax (0.04)
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- North America > United States > Michigan > Ingham County > East Lansing (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity
Amirova, Aliya, Fteropoulli, Theodora, Ahmed, Nafiso, Cowie, Martin R., Leibo, Joel Z.
Today, using large-scale generative language models (LLMs), it is possible to simulate free responses to interview questions like those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative methods, with the aim of producing insights that could generalize to real human populations. The key concept in our analysis is algorithmic fidelity, a term introduced by Argyle et al. (2023) capturing the degree to which LLM-generated outputs mirror human sub-populations' beliefs and attitudes. By definition, high algorithmic fidelity suggests latent beliefs elicited from LLMs may generalize to real humans, whereas low algorithmic fidelity renders such research invalid. Here we used an LLM to generate interviews with silicon participants matching specific demographic characteristics one-for-one with a set of human participants. Using framework-based qualitative analysis, we showed that the key themes obtained from both human and silicon participants were strikingly similar. However, when we analyzed the structure and tone of the interviews, we found even more striking differences. We also found evidence of the hyper-accuracy distortion described by Aher et al. (2023). We conclude that the LLM we tested (GPT-3.5) does not have sufficient algorithmic fidelity to expect research on it to generalize to human populations. However, the rapid pace of LLM research makes it plausible this could change in the future. Thus we stress the need to establish epistemic norms now around how to assess the validity of LLM-based qualitative research, especially concerning the need to ensure representation of heterogeneous lived experiences.
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Personal > Interview (1.00)
Out of One, Many: Using Language Models to Simulate Human Samples
Argyle, Lisa P., Busby, Ethan C., Fulda, Nancy, Gubler, Joshua, Rytting, Christopher, Wingate, David
We propose and explore the possibility that language models can be studied as effective proxies for specific human sub-populations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the "algorithmic bias" within one such tool -- the GPT-3 language model -- is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property "algorithmic fidelity" and explore its extent in GPT-3. We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and socio-cultural context that characterize human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.
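The backstory-conditioning step behind the "silicon samples" described above can be sketched as follows. The profile fields, wording, and the `build_backstory_prompt` helper are illustrative assumptions for this sketch, not the authors' actual prompts or survey items.

```python
# A minimal sketch of conditioning a language model on a first-person
# socio-demographic backstory, whose completion is then treated as a
# simulated ("silicon") survey response. The prompt text built here
# would be passed to any text-completion API.

def build_backstory_prompt(profile: dict, question_stem: str) -> str:
    """Compose a first-person backstory from a demographic profile,
    then append the survey question stem the model should complete."""
    backstory = (
        f"I am {profile['age']} years old. I am {profile['gender']}. "
        f"Racially, I identify as {profile['race']}. "
        f"Politically, I consider myself {profile['ideology']}. "
    )
    return backstory + question_stem

# Hypothetical participant profile and question stem (illustrative only).
prompt = build_backstory_prompt(
    {"age": 42, "gender": "a woman", "race": "Black",
     "ideology": "a moderate"},
    "When asked who I voted for in the presidential election, I said: ",
)
print(prompt)
```

Repeating this over thousands of real participants' profiles, then comparing the distribution of model completions with the humans' actual answers, is the fidelity test the abstract describes.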
- North America > United States > New York (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Wisconsin (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.67)
- Law (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)